Abstract


This paper describes a graph-based approach to image processing, intended for use with images
obtained from sensors having space variant sampling grids. The connectivity graph (CG) is
presented as a fundamental framework for posing image operations in any kind of space variant
sensor. Partially motivated by the observation that human vision is strongly space variant, a
number of research groups have been experimenting with space variant sensors. Such systems
cover wide solid angles yet maintain high acuity in their central regions. Implementation of
space variant systems poses at least two outstanding problems. First, such a system must be
active, in order to utilize its high acuity region; second, there are significant image processing
problems introduced by the non-uniform pixel size, shape and connectivity. Familiar image
processing operations such as connected components, convolution, template matching, and
even image translation, take on new and different forms when defined on space variant images.
The present paper provides a general method for space variant image processing, based on a
connectivity graph which represents the neighbor-relations in an arbitrarily structured sensor.
We illustrate this approach with the following applications: (1) Connected components is
reduced to its graph theoretic counterpart. We illustrate this on a logmap sensor, which
possesses a difficult topology due to the branch cut associated with the complex logarithm
function. (2) We show how to write local image operators in the connectivity graph that
are independent of the sensor geometry. (3) We relate the connectivity graph to pyramids
over irregular tessellations, and implement a local binarization operator in a 2-level pyramid.
(4) Finally, we expand the connectivity graph into a structure we call a transformation graph,
which represents the effects of geometric transformations in space variant image sensors. Using
the transformation graph, we define an efficient algorithm for matching in the logmap images
and solve the template matching problem for space variant images. Because of the very small
number of pixels typical of logarithmic structured space variant arrays, the connectivity graph
approach to image processing is suitable for real-time implementation, and provides a generic
solution to a wide range of image processing applications with space variant sensors.
1 Introduction

We are accustomed to thinking of an image as a rectangular grid of rectangular pixels. In such
an image, connectivity and adjacency are simply defined in terms of the four or eight neighbors
of each pixel. But in biological vision systems [10] [34] [38], as well as in space variant artificial
image sensors based on CCD [19] [33] [39], MOS [8] [50], and firmware [7] implementations,
there exists a pattern of photodetectors having spatially changing size, shape and connectivity
across the image array. One motivation for the study of space variant sensing derives from its
prominent role in the architecture of the human visual system, and so space variant sensors
are sometimes called foveated or retinal. Their elegant mathematical properties for certain
visual computations also motivate the development of space variant sensors, and so they are
sometimes called log polar, log spiral, or logmap sensors. Except where we note explicitly,
we use the term logmap to refer to a visual sensor whose outstanding characteristic is that a
wide angle workspace may be covered, with the capability for high resolution sensing at the
center of the array, incurring a small pixel sensing and processing burden proportional only to
the logarithm of the size of the workspace [28] [27] [14]. The special relationship of logmap
sensors to the human visual system is a constant of nature analogous to the frame rate constants
in motion pictures and TV, and some researchers have exploited this property to build
multiresolution head-tracking viewer-centered displays [37] [49] [48] and low-cost computer
graphic displays based on the logmap [15] [20]. We have applied the logmap architecture to
a very low cost moving car license plate identification and tracking system [24] [25] and to a
prototype consumer video telephone [40].

A fundamental problem with logmap sensors arises from their varying connectivity across
the sensor plane. Pixels which are neighbors in the sensor are not necessarily neighbors once a
computer reads the data into an array, making it difficult or impossible to apply conventional
image array operations. Some sensors integrate a central uniform resolution region inside an
annulus of logarithmic structured pixels, leading to the problem of how to combine image data
from the two grids. This paper develops a data abstraction called the connectivity graph (CG),
which represents explicitly the connectedness relations in arbitrary space variant images (see
also [41] [42] [43]). Using the graph, we define a variety of sensor-independent image processing
operations. The connected components problem for space variant images then reduces to its
graph theory counterpart. Local image operations are defined as a function of a pixel and
its neighbors, which turns out to have the additional effect of eliminating special cases for
boundary conditions. We discuss typical local operations such as edge detection, smoothing,
and relaxation. Building on the work reported by Montanvert et al. [22], we define a pyramid
structure for space variant images and apply it to a local binarization operation. Finally,
we introduce a class of variations of the CG which we call transformation graphs, and which
represent the effects of image transformations such as translations, rotations and scalings. We
demonstrate an implementation of template-matching using the transformation graphs.
1.1 Background

Since the size and cost of a machine vision system, and of the motors which drive it if it is active,
scale with the number of pixels it must process, space variant active vision provides the
possibility of radical reductions in system size and cost. We have built a 32 frames per second
(fps) real-time computer vision hardware system based on the logmap pixel geometry [5] [8]
[3] [7] [4]. The choice of the logmap had two direct architectural implications. First, there is
a tremendous reduction in the sensor data rate. The low data rate enables real-time logmap
image processing routines to run on modest microprocessors, and allows inter-processor image
data communication on low capacity channels. The low data rate also enables our consumer
video phone to transmit logmap images at 4 fps over ordinary voice telephone lines [40]. The
second major impact of the logmap results from the fact that the peripheral pixels are in effect
low pass filters. Therefore, we can allow the periphery to be blurry, with the consequence that the
logmap sensor utilizes a smaller lens than a video sensor, which reduces the size and cost of
the camera and its actuator.

With the sensor mounted on a very low cost camera pointing actuator, the Spherical Pointing
Motor [9] [6], the total system cost was two orders of magnitude lower than a comparable
system based on conventional video sensors and more elaborate actuators.

We want to state clearly that our system emulates a logmap sensor with an embedded DSP
and a conventional CCD. Specifically, an AD 2101 DSP reads pixel values digitized from the
192 x 165 Texas Instruments CCD chip TC211, and averages CCD pixels to form a 1376 pixel
logmap. But the DSP is fast enough to run the readout at 32 fps, so the emulated sensor is,
in effect, a sensor. The sensor emulator contained only commodity integrated circuits, which
eliminated the need for costly custom VLSI.

Constructed at very low cost, systems of this type may open up new niches for machine
vision, and can only be built, at present, using space variant sensing to limit the size and
cost of the motors, memory, CPUs, and interprocessor communications. For these reasons,
space variant sensors in general, and logarithmic sensors in particular, have begun to attract
attention in the machine vision community [2] [11] [12] [16] [18] [19] [23] [21] [26] [28] [27] [32]
[33] [39] [36] [44] [46] [45] [47] [50]. The sensor geometry motivating much of the work in this
paper is given by Rojer [28] [27].

The basis of our approach to space variant image processing is the use of a CG to represent
the neighbor relations between pixels in space variant images. Rosenfeld was the first to apply
the topological notion of connectivity to image processing [29]. The standard image processing
texts (e.g. [30] vol. 2 p. 208, [17] p. 67ff) discuss the connectivity paradox for digital images,
namely that the Jordan curve theorem does not hold when the foreground and background are
both four or eight connected. These texts also mention connectivity under a hexagonal pixel
geometry. Rojer [28] [27] was apparently the first to use a connectivity graph to implement
local image processing operations in a space variant image. The application of graphs to
represent arbitrary neighbor relations for region segmentation is also discussed in the standard
texts (e.g. [1] p. 159ff). Recently, Montanvert et al. [22] have shown how to define an image
pyramid recursively from an irregular tessellation.
Figure 1: (a) A TV image, $I(i, j)$, (b) the inverse logmap image $L^{-1}(i, j)$, and (c) the forward
logmap image $L(u, v)$. We enlarged the printed pixels in the logmap (c). Each logmap pixel
requires one byte of memory, and the logmap (c) represents a data reduction, compared with
(a), of 30:1.

1.2  Space  variant  image  sensors 

The  CG  allows  us  to  write  image  processing  routines  that  are  independent  of  a  particular 
sensor geometry. To formalize the idea of "sensor-independent" algorithms, we need to make
a distinction between three types of images. Figure 1(a) is a conventional raster, video, or TV
image. Figure 1(b) is called, for reasons given below, an inverse logmap image. The (forward)
logmap image appears in Figure 1(c). These three types of image are related by a pair of
lookup  tables  S  and  R,  which  map  TV  image  coordinates  to  logmap  array  coordinates.  If  we 
obtain  the  logmap  image  from  a  conventional  video  sensor,  as  in  our  miniaturized  system  [8], 
then  the  lookup  tables  S  and  R  tell  us  which  aggregate  set  of  TV  pixels  to  average  in  order  to 
form  a  logmap  pixel. 

But  it  is  worth  emphasizing  that  even  if  we  had  a  VLSI  logmap  sensor,  we  would  like  to 
display the logmap images in an inverse logmap format, like the one in Figure 1(b). Therefore
for all conventional raster displays, and for firmware logmap readouts, we need to think of a
logmap in relation to a TV screen or a TV image sensor. Moreover, the TV pixel representation
of the logmap sensor geometry is the basis for an algorithm, introduced below, to
construct  the  CG  for  an  arbitrary  sensor  geometry. 

We define the logmap formally as a mapping from a TV raster image $I(i, j)$, where
$i \in \{0, \ldots, m-1\}$ and $j \in \{0, \ldots, n-1\}$, to a logmap array $L(u, v)$, with
$u \in \{0, \ldots, s-1\}$ and $v \in \{0, \ldots, r-1\}$. The mapping is specified by two lookup
tables $S(i, j)$ and $R(i, j)$, whose names are chosen as mnemonics for "spoke" and "ring"
logmap indexes (see Figure 2). Let $a(u, v)$ be the area (in TV pixels) of a logmap pixel $(u, v)$:

$$a(u, v) = \sum_{i,j} 1 \;\mid\; S(i, j) = u \text{ and } R(i, j) = v.$$

Logmap pixels $(u, v)$ for which $\not\exists\, i, j \mid S(i, j) = u$ and $R(i, j) = v$ are undefined, and we
Figure 2: The lookup table of logmap pixels from TV pixels. The logmap has 16 spokes and 20
rings, but a total of only 272 pixels. The shift factor alpha = 4. (The lookup indexes are only
plotted for logmap pixels having area at least 10 TV pixels.) The pixel (0, 1) has 8 neighbors,
pixel (1, 1) has 7 neighbors, but pixel (1, 0) has only 4 neighbors. Logmap pixel (0, 0) is
connected to (0, 19) across the vertical meridian, but unconnected in the logmap array.
adopt the convention that for those pixels $a(u, v) = 0$. We observe that if $k$ is the number
of pixels for which $a(u, v) > 0$, then $k \le sr$. The logmap image is defined where $a(u, v) > 0$ by

$$L(u, v) = \frac{1}{a(u, v)} \sum_{i,j} I(i, j) \;\mid\; S(i, j) = u \text{ and } R(i, j) = v,$$

in other words, the average of aggregate sets of TV pixels. The inverse logmap is

$$L^{-1}(i, j) = L(S(i, j), R(i, j)),$$

provided $S(i, j)$ and $R(i, j)$ are defined. The inverse logmap function converts a logmap image
to TV coordinates for display.
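The forward and inverse logmap definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' DSP implementation: the function names `forward_logmap` and `inverse_logmap`, the use of `None` for undefined table entries, and the `fill` value for undefined display pixels are all assumptions made for the example.

```python
def forward_logmap(image, S, R):
    """Average TV pixels into logmap pixels.

    image, S and R are equally sized lists of rows. S[i][j] and R[i][j] hold
    the spoke and ring index of TV pixel (i, j), or None where undefined
    (an assumed encoding). Returns (L, area) as dicts keyed by (u, v);
    only logmap pixels with area > 0 appear.
    """
    total, area = {}, {}
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            u, v = S[i][j], R[i][j]
            if u is None or v is None:
                continue
            area[(u, v)] = area.get((u, v), 0) + 1          # a(u, v)
            total[(u, v)] = total.get((u, v), 0.0) + value  # sum of I(i, j)
    L = {uv: total[uv] / area[uv] for uv in area}           # L(u, v)
    return L, area


def inverse_logmap(L, S, R, fill=0.0):
    """Map a logmap image back to TV coordinates for display."""
    return [[L.get((S[i][j], R[i][j]), fill) for j in range(len(S[0]))]
            for i in range(len(S))]
```

Storing the logmap as a dictionary keyed by `(u, v)` keeps only the $k \le sr$ defined pixels, which mirrors the convention that undefined pixels have zero area.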

To see how to construct the lookup tables $S$ and $R$, we look at the example of a logmap
sensor given by the complex mapping $w = \log(z + a)$. Figure 3 illustrates the difference
between a sensor defined by $w = \log(z)$ and one defined by $w = \log(z + a)$. For simplicity,
we restrict our attention to logmap sensors having four-way symmetry, i.e. the pixel geometry
is reflected about the vertical and horizontal meridians. We construct the 4-way symmetric
logmap by first computing the one-quadrant mapping given by a pair of lookup tables $S_Q(x, y)$
and $R_Q(x, y)$, then obtaining $S$ and $R$ by a simple symmetry procedure. The lookup tables
$S_Q(x, y)$ and $R_Q(x, y)$ have their indices $x \in \{0, \ldots, \frac{n}{2} - 1\}$ and $y \in \{0, \ldots, \frac{m}{2} - 1\}$ transposed
from matrix coordinates to Cartesian coordinates.

The basic idea behind the sensor geometry is to treat each TV pixel coordinate $(x, y)$ as a
complex number $z = x + iy$, then to calculate the complex point $w = \log(z + a)$, and then to
normalize $w$ to the integer array indexes for $S_Q$ and $R_Q$.

If we assume that there are no more rows than columns ($m \le n$), the minimum value of
the log function occurs in the pixel (0, 0) and its maximum value is where the outermost ring
intersects the $y$ axis. We denote the maximum and minimum values as

$$\rho_{\max} = \log \left| \left(a + \tfrac{1}{2},\; \tfrac{m}{2} - \tfrac{1}{2}\right) \right|, \quad \text{and}$$
$$\rho_{\min} = \log \left| \left(a + \tfrac{1}{2},\; \tfrac{1}{2}\right) \right|$$

respectively. We add the offsets of $\tfrac{1}{2}$ to compute the lookup table from the center of each pixel,
rather than the corner. This has the beneficial effect of preventing the axial pixels, and the
origin, from being replicated when we compute the 4-way symmetric tables.

The minimum value of the spoke angle occurs on the $y$ axis and is given by

$$\theta_{\min} = \tan^{-1}\left(a + \tfrac{1}{2},\; \tfrac{m}{2} - \tfrac{1}{2}\right).$$

Then $\forall (x, y) \mid x \in \{0, \ldots, \frac{n}{2} - 1\}$ and $y \in \{0, \ldots, \frac{m}{2} - 1\}$ we let
Figure 3: Two types of space variant sensor defined by the mapping $w = \log(z + a)$ (see
text). The mapping transforms the point P± into the point Pn. One way to easily visualize the
$w = \log(z + a)$ sensor is to think of cutting a strip, shown by the hatched area of (a), out of
the middle of a sensor geometry defined by $w = \log(z)$.
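$$\rho = \left| \left(x + a + \tfrac{1}{2},\; y + \tfrac{1}{2}\right) \right|$$

so that

$$R_Q(x, y) = \left\lfloor \frac{r \left(\log(\rho) - \rho_{\min}\right)}{2 \left(\rho_{\max} - \rho_{\min}\right)} \right\rfloor. \qquad (1)$$

We also let

$$\theta = \tan^{-1}\left(x + a + \tfrac{1}{2},\; y + \tfrac{1}{2}\right)$$

so that

$$S_Q(x, y) = \left\lfloor \frac{s \left(\theta - \theta_{\min}\right)}{\pi - 2\theta_{\min}} \right\rfloor. \qquad (2)$$
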


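Where $R_Q(x, y) \ge \frac{r}{2}$ we change the values of $R_Q$ and $S_Q$ to undefined. A 4-way symmetry
procedure obtains the full frame logmap lookup tables $S$ and $R$ in TV coordinates:
$\forall i \in \{0, \ldots, \frac{m}{2} - 1\}$ and $j \in \{0, \ldots, \frac{n}{2} - 1\}$ where $R_Q(j, i)$ is defined, let

$$S(\tfrac{m}{2} - 1 - i,\; \tfrac{n}{2} - 1 - j) = S(\tfrac{m}{2} - 1 - i,\; \tfrac{n}{2} + j) = S_Q(j, i),$$
$$S(\tfrac{m}{2} + i,\; \tfrac{n}{2} - 1 - j) = S(\tfrac{m}{2} + i,\; \tfrac{n}{2} + j) = s - 1 - S_Q(j, i),$$
$$R(\tfrac{m}{2} - 1 - i,\; \tfrac{n}{2} - 1 - j) = R(\tfrac{m}{2} + i,\; \tfrac{n}{2} - 1 - j) = \tfrac{r}{2} - 1 - R_Q(j, i),$$
$$R(\tfrac{m}{2} - 1 - i,\; \tfrac{n}{2} + j) = R(\tfrac{m}{2} + i,\; \tfrac{n}{2} + j) = \tfrac{r}{2} + R_Q(j, i).$$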

We make the undefined region 4-way symmetric similarly. For more general mappings, we
can think of the image in Figure 1(b) as what the sensor sees and the one in Figure 1(c) as
a representation of that image inside computer memory, that is, as an array. Not all cells in
the array correspond to sensor pixels: the cells between the "butterfly wings" of the complex
log mapping do not correspond to any pixel in the sensor. Moreover, those pixels along the
vertical meridian of the sensor are adjacent across the meridian, but they are separated in the
memory image. If we try to apply conventional image processing routines to data like that
from this sensor, they produce undesirable artifacts along the vertical meridian.
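The lookup-table construction for the $w = \log(z + a)$ sensor can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, $m$, $n$, $s$ and $r$ are assumed even, the two-argument arctangent is assumed to measure the angle from the $y$ axis (so that the minimum spoke angle falls on the $y$ axis as the text states), and the offsets in the symmetry step follow a reading in which the upper half of the image takes spokes $0 \ldots s/2 - 1$ and the right half takes rings $r/2 \ldots r - 1$.

```python
import math


def quadrant_tables(m, n, s, r, a):
    """One-quadrant tables S_Q and R_Q, indexed [x][y] in Cartesian order."""
    half_w, half_h = n // 2, m // 2
    rho_min = math.log(math.hypot(a + 0.5, 0.5))
    rho_max = math.log(math.hypot(a + 0.5, half_h - 0.5))
    # Assumed convention: angle measured from the y axis.
    theta_min = math.atan2(a + 0.5, half_h - 0.5)
    SQ = [[None] * half_h for _ in range(half_w)]
    RQ = [[None] * half_h for _ in range(half_w)]
    for x in range(half_w):
        for y in range(half_h):
            rho = math.hypot(x + a + 0.5, y + 0.5)
            ring = math.floor(
                r * (math.log(rho) - rho_min) / (2 * (rho_max - rho_min)))
            if ring >= r // 2:
                continue  # outside the outermost ring: leave undefined
            theta = math.atan2(x + a + 0.5, y + 0.5)
            spoke = math.floor(
                s * (theta - theta_min) / (math.pi - 2 * theta_min))
            RQ[x][y], SQ[x][y] = ring, spoke
    return SQ, RQ


def full_tables(m, n, s, r, a):
    """Reflect the quadrant tables into full-frame tables S and R."""
    SQ, RQ = quadrant_tables(m, n, s, r, a)
    S = [[None] * n for _ in range(m)]
    R = [[None] * n for _ in range(m)]
    hm, hn = m // 2, n // 2
    for i in range(hm):
        for j in range(hn):
            if RQ[j][i] is None:
                continue
            sq, rq = SQ[j][i], RQ[j][i]
            # Upper rows keep the spoke index, lower rows mirror it.
            for row, spoke in ((hm - 1 - i, sq), (hm + i, s - 1 - sq)):
                S[row][hn - 1 - j] = S[row][hn + j] = spoke
                R[row][hn - 1 - j] = r // 2 - 1 - rq   # left half
                R[row][hn + j] = r // 2 + rq           # right half
    return S, R
```

Computing the tables from pixel centers (the 0.5 offsets) means no TV pixel lies exactly on an axis, so the reflection never writes the same TV pixel twice.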

One possible solution to this boundary-effect problem is to duplicate the meridian pixels
in the image array. This approach is especially attractive for CCD sensors based on the $\log(z)$
mapping, because timing requirements of CCD imaging result in a cheap readout technique
to store $sr$ pixels duplicated in a size $2sr$ memory buffer. But this buffer does not capture
connectivity across the sensor's center. Examination of Figure 2 shows that for the more general
$\log(z + a)$, array adjacency alone cannot capture all the neighbor relations. Duplicate-pixel
representations also pose undue complexities for iterative relaxation procedures like the soap
bubble algorithm described below. These connectivity problems arise not only in the $\log(z + a)$
pixel architecture, but also in those hybrid sensors composed of a $\log(z)$ architecture combined
with a central foveal area having uniform resolution.

More generally, suppose we have an image sensor in which the pixels have no regular
pattern, such as the one illustrated in Figure 4. In this case there is no obvious way to map
the pixels into a two dimensional array. Yet if we had such a sensor, we might still be motivated
to do image processing with it. In this paper we show that image processing operations can in
fact be defined on such images, given a systematic approach to the connectivity problem.


2  The  connectivity  graph 

In this section we give a definition of the CG and then show how to use it to solve the connected
components problem in a logmap image.

2.1 Defining the connectivity graph

The CG is a graph $G = (V, E)$ whose vertices $V$ stand for sensor pixels and whose edges $E$
represent the adjacency relations between pixels. Figure 5 illustrates the CG for our logmap
sensor geometry. Associated with a vertex $p$ is a logmap pixel address $(u, v)$. Thus we write
$(u(p), v(p))$ for a pixel coordinate identified by its graph vertex, or $p = \phi(u, v)$ for a vertex
identified by its pixel coordinate. Then $V = \{p_0, \ldots, p_{k-1}\}$, where $|V| = k$ is both the number
of logmap sensor pixels and the number of vertices in $V$. We use the expression $\mathcal{N}(p)$ to refer
to the set of pixels adjacent to $p$:

$$\mathcal{N}(p) = \{q \mid (p, q) \in E\}.$$

We discussed the example of a logmap generated by the expression $w = \log(z + a)$ in the
previous section. The lookup table depicted in Figure 2 illustrates that some logmap pixel
Figure 5: The CG for the sensor depicted in Figure 1. Circles stand for graph vertices and
lines stand for graph edges. The plot (a) shows the vertices in logmap array coordinates, and
the plot (b) shows the same graph in sensor coordinates. The graph explicitly represents the
neighbor relations across the vertical meridian.
pairs are 8-connected both in the logmap and in the TV coordinate system, but other pairs
are connected only in the logmap. We would like the connectivity graph edge set $E$ to include
both sets of neighbor relations. Let $E_F$ be the set of edges representing connections between
pixels in the forward mapping. An edge $(p, q) \in E_F$ provided $(u(p), v(p))$ is an 8-(4-)neighbor
of $(u(q), v(q))$.

For the logmap sensor in Figure 1, $E_F$ does not contain the connections across the vertical
meridian. To obtain all the connections in the sensor, we place the inverse image adjacency
relations in the edge set $E_I$:

$$(p, q) \in E_I \text{ provided } \exists\, i, j, i_0, j_0 \text{ such that }
\begin{cases}
S(i, j) = u(p),\; R(i, j) = v(p) \text{ and} \\
S(i_0, j_0) = u(q),\; R(i_0, j_0) = v(q) \text{ and} \\
(i, j) \text{ is an 8-(4-)neighbor of } (i_0, j_0).
\end{cases} \qquad (3)$$

For the logmap image in Figures 1 and 2 the CG edge set is $E = E_I \cup E_F$.

For an arbitrary pixel geometry sensor such as the one illustrated in Figure 4, the construction
of the edge set $E$ differs slightly. The forward map for a $k$ pixel sensor of this type is simply a
$k$-element array, but neighbor relations in that array are not necessarily related to connectedness
of the sensor pixels. When the sensor mapping is given by lookup tables $S$ and $R$ that do not
preserve adjacency in the forward map, like the one illustrated in Figure 4, we set the CG edge
set $E = E_I$.

Given the lookup tables $S$ and $R$, the algorithm to compute the connectivity graph is
simple. One vertex $p \in V$ is allocated for each pixel $(u, v)$. The graph edges in $E_F$ are just
the 8- or 4-edges from the forward map image. The graph edges in $E_I$ are computed by
scanning the lookup tables $S(i, j)$ and $R(i, j)$ with a 2 x 2 operator.
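A minimal sketch of this construction, under stated assumptions, is given below. The function name `connectivity_graph`, the adjacency-set representation, and the `None` encoding of undefined table entries are choices made for the example and are not taken from the paper. The scan visits each unordered pair of adjacent TV pixels once by looking only right and down, which is one way to realize a small scanning window; it is not claimed to be the authors' exact operator.

```python
def connectivity_graph(S, R, eight=True):
    """Return {pixel: set of neighbor pixels} for the edge set E = E_I U E_F.

    S[i][j], R[i][j] give the logmap pixel (u, v) of TV pixel (i, j),
    or None where undefined (assumed encoding).
    """
    # Half-neighborhood offsets: each unordered pair is visited once.
    offsets = [(0, 1), (1, 0), (1, 1), (1, -1)] if eight else [(0, 1), (1, 0)]
    m, n = len(S), len(S[0])
    pixels = {(S[i][j], R[i][j]) for i in range(m) for j in range(n)
              if S[i][j] is not None and R[i][j] is not None}
    graph = {p: set() for p in pixels}

    def connect(p, q):
        # Ignore self-loops and undefined pixels.
        if p != q and p in graph and q in graph:
            graph[p].add(q)
            graph[q].add(p)

    # E_F: adjacency of (u, v) addresses in the forward-map array.
    for (u, v) in pixels:
        for du, dv in offsets:
            connect((u, v), (u + du, v + dv))

    # E_I: adjacent TV pixels that belong to different logmap pixels.
    for i in range(m):
        for j in range(n):
            for di, dj in offsets:
                i0, j0 = i + di, j + dj
                if 0 <= i0 < m and 0 <= j0 < n:
                    connect((S[i][j], R[i][j]), (S[i0][j0], R[i0][j0]))
    return graph
```

For a sensor whose forward array does not preserve adjacency, the first loop would simply be skipped, giving $E = E_I$.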

The CG is a data abstraction for space variant image data, and as such it may be implemented
in a variety of ways, including the computation of neighbor relations on line. When
the logmap geometry is given by functions such as (1) and (2), we can transform them into
analytic functions by removing the $\lfloor \cdot \rfloor$ integer transform, and then obtain neighbor relations
by searching for pixel indexes satisfying (3). Another possibility is computing all the neighbor
relations in advance, then storing them in discrete graph structures. Our most efficient
implementation of CG operations uses a hybrid of both approaches. Examination of Figure 5 shows
that it is only the boundary pixels that have special cases of connectivity relations, whereas
the interior pixels, which cover most of the logmap area, all have the usual 8-connectedness.
Therefore at run time a single branch instruction determines whether a pixel is interior or
exterior, and for the former it calculates neighbor storage locations by efficient autoincremental
addressing. Computations on the boundary of the CG use pre-calculated lists of neighbors to
retrieve data from them efficiently.

Other operations we define on the graph node $p$ are the pixel array coordinates $(u(p), v(p))$,
the pixel intensity² $L(p) = L(u(p), v(p))$, the pixel centroid $\mu(p)$, the pixel area $a(p) =$

² We have intentionally overloaded the definition of $L()$. The two-argument $L(u, v)$ is pixel brightness in the
array representation, but the one-argument $L(p)$ is pixel brightness in the CG representation.
Figure 6: The connectivity graph (b) for a CMOS logmap sensor made by Synaptics, Inc. The
center pixels form a rectangular grid, and these are surrounded by annular rings of increasing
size. The image (a) shows a simulation of this sensor.


$a(u(p), v(p))$, and the number of neighbors $|\mathcal{N}(p)|$. The pixel centroid $\mu(p)$ is given by

$$\mu(p) = \frac{1}{a(p)} \sum_{i,j} (i, j) \;\mid\; S(i, j) = u(p) \text{ and } R(i, j) = v(p).$$


We and others have designed and fabricated a variety of space variant sensors [12] [28] [27]
[32] [33] [39]. As illustrated in Figures 6 and 7, there exists a CG for each of these architectures.
We will use the CG to show, in the next section, how to develop a library of image processing
routines that can run on data from any of these sensors, and therefore that the CG is a generic
solution to image processing problems under an unconstrained image sensor geometry. But
first, we will describe the image processing problem that led us to connectivity graphs in the
first place, namely, connected components analysis.


2.2 Connected components

The connected components problem begins with a given predicate $F$, for example the
foreground-background function

$$F(p, q) = \begin{cases} \text{true} & L(p) = L(q) \\ \text{false} & \text{otherwise,} \end{cases}$$

true in a binary image when the two pixels $p$ and $q$ are either both black or both white.
The algorithm to find all connected components of pixels satisfying $F$ is well-known [1] [17]
[30]. In fact we use the familiar graph theoretic algorithm [13]. First we break the CG into
subgraphs by removing those edges between pixels where $F(p, q)$ does not hold. Then, by
the connectedness and components algorithm for graphs, our program extracts the connected
regions. Figure 8 shows the result.
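The two steps above can be sketched with a breadth-first search over the CG. This is a generic graph-components sketch, not necessarily the specific algorithm of [13]; the function name `connected_components` and the dictionary-based graph and image representations are assumptions for illustration. Rather than physically deleting edges, the sketch skips any edge on which the predicate fails, which has the same effect.

```python
from collections import deque


def connected_components(graph, L, F=lambda a, b: a == b):
    """Label every vertex of the CG with a component number.

    graph maps each vertex to its set of neighbors, L maps each vertex to
    its pixel value, and F is the predicate applied to the two pixel values
    on an edge. Edges where F fails are treated as removed.
    """
    label = {}
    count = 0
    for seed in graph:
        if seed in label:
            continue
        label[seed] = count
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in graph[p]:
                if q not in label and F(L[p], L[q]):
                    label[q] = count
                    queue.append(q)
        count += 1
    return label, count
```

Because the search follows graph edges rather than array offsets, regions that straddle the vertical meridian are labeled as one component without any special handling.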
Figure 7: The connectivity graph of a log polar sensor given by the mapping $w = \log(z)$, where
the origin is not in the domain of the mapping. The connectivity graph is plotted in inverse map
coordinates (b) and in forward map coordinates (c). For this type of sensor, a horizontal shift
in forward coordinates is equivalent to an image scaling, and a vertical shift is equivalent to
an image rotation. Note that the CG representation makes explicit the wraparound necessary
for image rotation, and also the sparseness of pixels near the origin.
Figure 8: The CG reduces the image connected components problem to its graph theoretic
counterpart. Here we show the foreground-background segmentation of an image containing
the letter "A".

3 Local neighborhood operations

Using  the  connectivity  graph,  we  can  define  local  neighborhood  operations  independently  of 
the  number  of  neighbors,  and  therefore  without  arbitrary  boundary  effects.  Our  CG  approach 
generalizes  local  image  processing  operations  to  arbitrary  sensor  geometries. 


Figure 9: (a) A logmap image having 1376 pixels and (b) the result of a CG local edge
magnitude operation. (c) The same edge image in forward map coordinates.

We can define a simple edge detector, the effect of which Figure 9 illustrates, as

$$e(p) = \frac{1}{|\mathcal{N}(p)|} \sum_{q \in \mathcal{N}(p)} \left(L(p) - L(q)\right)^2.$$

Note that the definition of $e(p)$ contains no special cases for pixels having different numbers
of neighbors, and thus applies equally to pixels at the boundary of the image and to those in
the interior.
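As a small illustration, the edge operator can be written directly over an adjacency-set graph. The function name `edge_magnitude` and the dictionary representation are assumptions for the example, and returning 0 for a vertex with no neighbors is an added guard that the text does not discuss.

```python
def edge_magnitude(graph, L):
    """Mean squared difference between each pixel and its CG neighbors.

    graph maps each vertex to its set of neighbors; L maps each vertex to
    its pixel value.
    """
    out = {}
    for p, nbrs in graph.items():
        if not nbrs:
            out[p] = 0.0  # assumed convention for isolated vertices
            continue
        out[p] = sum((L[p] - L[q]) ** 2 for q in nbrs) / len(nbrs)
    return out
```

The same loop serves a pixel with eight neighbors and a boundary pixel with four, since the sum and the normalization both follow the neighbor set.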


Figure 10: (a) The result of a CG pseudo-Laplacian local operation. (b) The same image
in forward map coordinates.
Another simple example is the pseudo-Laplacian $l(p)$, illustrated in Figure 10, given by

$$l(p) = L(p) - \frac{1}{|\mathcal{N}(p)|} \sum_{q \in \mathcal{N}(p)} L(q).$$

Of course, computing the actual Laplacian on the sensor array requires transforming
expressions like (1) and (2) into analytic functions and solving for the second derivatives.

We can generalize these definitions to take into account other pixel properties such as the
relative differences in pixel size $a(p)$, or the pixel mean $\mu(p)$. For example we could have
defined

$$l'(p) = L(p) - \sum_{q \in \mathcal{N}(p)} w(p, q) L(q) \Big/ \sum_{q \in \mathcal{N}(p)} w(p, q)$$

where $w(p, q) = a(q) / |\mu(p) - \mu(q)|$.
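Both forms of the pseudo-Laplacian can be sketched with one routine. The name `pseudo_laplacian` and its optional `area` and `centroid` arguments are hypothetical; with both omitted the weights are uniform and the routine reduces to the unweighted operator, otherwise it uses $w(p, q) = a(q) / |\mu(p) - \mu(q)|$. It assumes distinct pixels have distinct centroids, so the distance is never zero.

```python
import math


def pseudo_laplacian(graph, L, area=None, centroid=None):
    """Pixel value minus the (optionally weighted) mean of its CG neighbors.

    graph: vertex -> set of neighbors; L: vertex -> pixel value;
    area: vertex -> pixel area; centroid: vertex -> (i, j) centroid.
    """
    out = {}
    for p, nbrs in graph.items():
        if not nbrs:
            out[p] = 0.0  # assumed convention for isolated vertices
            continue
        if area is None or centroid is None:
            w = {q: 1.0 for q in nbrs}  # uniform weights
        else:
            w = {q: area[q] / math.dist(centroid[p], centroid[q])
                 for q in nbrs}
        out[p] = L[p] - sum(w[q] * L[q] for q in nbrs) / sum(w.values())
    return out
```

The weighting gives large, nearby neighbors more influence, which matters in a logmap where neighboring pixels can differ substantially in size.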

A  slightly  more  complicated  definition  is  illustrated  by  a.  plane-fitting  edge  operator.  Given 
the  d  =  j,V(p)l  +  1  points 


{qu...qd}  =  {p}  U,V(p) 

and  a  vector  of  their  gray  values  A  =  {L(qi). . . .,  F(<p/)),  fit  a  plane  h(i9j)  =  a  •  (i.j.  1)T  to 
minimize  some  error  measurement  .  Applying  the  least  squares  error  criteria  for  linear  equations 
[17],  we  can  find  a  plane  by  solving  for 


A  =  aB  where 


B  = 


/ 

\  1 


The  solution  of  a  for  \M (p)|  >  2  is 


t*(9d) 

1 


a  =  ABt(BBt)-1 

If the points are not all collinear, $(BB^T)^{-1}$ always exists.

Since $M = B^T (BB^T)^{-1}$ is constant for each pixel $p$, we precompute and store this matrix
$M(p)$ for every graph vertex $p$. At run time, we only need to compute $A\,M(p)$ for every graph
node $p$.
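The split between precomputation and run-time work can be sketched as follows. This is a pure-Python illustration under the assumption that each pixel's centroid is an `(i, j)` pair; the helper names `inverse_3x3`, `precompute_M`, and `fit_plane` are hypothetical.

```python
def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate; raises if it is singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("B B^T is singular: the points are collinear")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def precompute_M(centroids):
    """Return the d x 3 matrix M = B^T (B B^T)^-1 for one pixel.

    centroids lists the (i, j) positions of p followed by its neighbors;
    column k of B is (i_k, j_k, 1).
    """
    cols = [(i, j, 1.0) for i, j in centroids]
    BBt = [[sum(c[r] * c[s] for c in cols) for s in range(3)] for r in range(3)]
    inv = inverse_3x3(BBt)
    return [[sum(c[r] * inv[r][s] for r in range(3)) for s in range(3)] for c in cols]


def fit_plane(A, M):
    """Run-time step: a = A M, the plane coefficients for gray values A."""
    return [sum(A[k] * M[k][s] for k in range(len(A))) for s in range(3)]


# Hypothetical pixel with three neighbors; gray values lie on h = 2i + 3j + 5.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
M = precompute_M(centroids)
a = fit_plane([5.0, 7.0, 8.0, 10.0], M)
```

Only `fit_plane` depends on the image data, so the per-frame cost is one small vector-matrix product per pixel.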

A simple iterative relaxation procedure, the soap bubble algorithm, is also defined naturally
on the CG. If $p$ is a seed point, then $L_{t+1}(p) = L_0(p)$; otherwise
The definitions given so far have expressed functions of a pixel and only its immediate
neighbors. Sometimes a local operator may map from a larger neighborhood, however. One
helpful function in defining such operations is the $\mathcal{N}_2$ function.

$$\mathcal{N}_2(p) = \{b \mid b \in \mathcal{N}(q) \text{ and } q \in \mathcal{N}(p) \text{ and } b \notin \mathcal{N}(p)\}.$$

The 2 in $\mathcal{N}_2(p)$ denotes the fact that these graph nodes are two edges away from the vertex $p$.
In this way, we say that $\mathcal{N}_1(p) = \mathcal{N}(p)$. We can go on to define $\mathcal{N}_3(p)$, $\mathcal{N}_4(p)$, etc. similarly.
Now we can define a smoothing operator $c(p)$ over an extended neighborhood:


$$c(p) = \frac{1}{|\mathcal{N}_1(p)| + |\mathcal{N}_2(p)| + 1} \Bigl( L(p) + \sum_{q \in \mathcal{N}_1(p) \cup \mathcal{N}_2(p)} L(q) \Bigr)$$


The local operations defined above are independent of the sensor geometry and its size. In
software simulations, we have experimented with a variety of space variant sensors. Because
they are all defined by the lookup tables $R$ and $S$, our program can automatically compute
the CG and apply any of the local operations to data from any of the sensors.
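The extended-neighborhood smoothing operator can be sketched with the same dictionary representation of the CG. The names `n2` and `smooth` are hypothetical, and the sketch assumes that $p$ itself is excluded from the two-edge neighborhood, which is what the normalizing term $|\mathcal{N}_1(p)| + |\mathcal{N}_2(p)| + 1$ implies.

```python
def n2(N, p):
    """Vertices exactly two edges from p: neighbors of neighbors,
    excluding the immediate neighbors and p itself (assumed)."""
    first = set(N[p])
    return {b for q in first for b in N[q]} - first - {p}


def smooth(L, N, p):
    """Mean gray value over p, its neighbors, and its two-edge neighbors."""
    ring = set(N[p]) | n2(N, p)
    return (L[p] + sum(L[q] for q in ring)) / (len(ring) + 1)


# Hypothetical path graph 0 - 1 - 2 - 3 - 4 with gray value 10 * index.
N = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
L = {k: 10.0 * k for k in N}
```

At the end of the path the neighborhood simply contains fewer pixels, so the operator needs no boundary case.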


4  Pyramid  operations 

Following the method of Montanvert et al. [22], we can recursively define a pyramid structure,
each layer of which is a general CG. This structure gives us a natural way to define partitions
for a space variant sensor. A pyramid structure over an irregular tessellation may be visualized
and represented by a graph. Each layer of the pyramid is similar to a connectivity graph, but
each vertex $p$ in the pyramid is also connected to the level above by a directed edge called
$\mathit{parent}(p)$.

We call the bottom layer of the pyramid level 0 and call each successive layer 1, 2, etc.
The parent edges at level $\ell$ are directed from level $\ell$ to level $\ell + 1$. We denote the vertex set at
level $\ell$ with $V(\ell)$. The function $\mathit{parent}(p)$ has the property that if $q = \mathit{parent}(p)$ and $p \in V(\ell)$
then $q \in V(\ell + 1)$. Moreover, every vertex $p$ has exactly one $\mathit{parent}(p)$ except for the vertices
at the highest pyramid level, where $\mathit{parent}(p)$ is undefined.

Given the graph $G(\ell) = (E(\ell), V(\ell))$ at level $\ell$, we define a recursive graph reduction to
obtain $G(\ell + 1) = (E(\ell + 1), V(\ell + 1))$ at level $\ell + 1$. The graph $G(0)$ is the same as the CG,
except that each vertex $p \in V(0)$ has the new edge relation $\mathit{parent}$. The vertices in successive
levels have the property that

$$q \in V(\ell + 1) \iff \exists\, p \in V(\ell) \mid \mathit{parent}(p) = q.$$

It follows that $|V(\ell + 1)| \le |V(\ell)|$, but we usually restrict our attention to the case that
$|V(\ell + 1)| < |V(\ell)|$.

To obtain the connectivity (i.e. non-parent) edges we apply the recursive graph reduction
procedure given in [22]:

$(p, q) \in E(\ell + 1)$ provided $p \in V(\ell + 1)$ and $q \in V(\ell + 1)$ and
$\exists\, p_0, q_0$ such that

$$\mathit{parent}(p_0) = p \text{ and } \mathit{parent}(q_0) = q \text{ and } (p_0, q_0) \in E(\ell).$$
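One step of this reduction can be sketched as below, assuming the edges at a level are given as pairs of vertices and the parent relation as a dictionary. The function name `reduce_graph` is hypothetical, and dropping edges whose endpoints share a parent (which would otherwise become self-loops) is an added assumption.

```python
def reduce_graph(edges, parent):
    """Build (V, E) at the next level from the edges and parents of this level.

    edges  : iterable of (p0, q0) vertex pairs at the current level
    parent : dict mapping every current-level vertex to its parent
    """
    vertices_up = set(parent.values())
    edges_up = set()
    for p0, q0 in edges:
        p, q = parent[p0], parent[q0]
        if p != q:  # assumption: children of one parent induce no edge
            edges_up.add(frozenset((p, q)))
    return vertices_up, edges_up


# Hypothetical level 0: path 0 - 1 - 2 - 3, grouped into parents "A" and "B".
edges0 = {(0, 1), (1, 2), (2, 3)}
parent0 = {0: "A", 1: "A", 2: "B", 3: "B"}
V1, E1 = reduce_graph(edges0, parent0)
```

Applying the function again to `E1` with a level-1 parent map would give level 2, which is the sense in which the reduction is recursive.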

4.1  Sensor  data  partitions 

A 2-level pyramid defines a partition $P_1, \ldots, P_{|V(1)|}$ of the sensor pixels, where $P_k$ is the set
$\{p \mid \mathit{parent}(p) = p_k\}$. We call the region formed by any of these partitions $P_k$ the focal
region. The broad definition of the recursive graph reduction represents the many ways to
partition the sensor data. If we want to perform operations on partitions that depend on the
number of pixels in them, we partition the sensor into connected regions having approximately
the same number of pixels. Another choice is to partition the sensor data into sets of pixels
having approximately the same area. The first choice is better when we need a statistically
significant set of pixels in order to find, for example, a reliable local threshold value. One
method recursively uses the centroid of all the pixels under consideration as a reference point
and repeatedly splits the sensor data into four smaller sets, terminating when it reaches a
minimum pixel count. This pyramid is topologically similar to the quadtree decomposition
[31] for conventional video images.
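The centroid-splitting method can be sketched as a short recursive function. The name `split_partition`, the `mu` dictionary of pixel positions, and the stopping rule (stop once a set holds at most `min_count` pixels) are assumptions; the sketch also does not check that each resulting set is connected in the CG.

```python
def split_partition(pixels, mu, min_count):
    """Recursively split pixels into four sets about their centroid."""
    if len(pixels) <= min_count:
        return [pixels]
    ci = sum(mu[p][0] for p in pixels) / len(pixels)
    cj = sum(mu[p][1] for p in pixels) / len(pixels)
    quads = {}
    for p in pixels:
        key = (mu[p][0] >= ci, mu[p][1] >= cj)  # which quadrant p falls in
        quads.setdefault(key, []).append(p)
    if len(quads) == 1:  # all positions coincide; cannot split further
        return [pixels]
    parts = []
    for key in sorted(quads):
        parts.extend(split_partition(quads[key], mu, min_count))
    return parts


# Hypothetical 4 x 4 block of pixels positioned on an integer grid.
pixels = [(i, j) for i in range(4) for j in range(4)]
mu = {p: (float(p[0]), float(p[1])) for p in pixels}
regions = split_partition(pixels, mu, min_count=4)
```

Each returned list plays the role of one partition $P_k$, so assigning all of its pixels a common parent yields the 2-level pyramid.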


4.2  Adaptive  local  binarization 

One application of the pyramid is local binarization. To binarize an image, we compute a local
threshold value for each local region, based on the method proposed by Shio [35]. He used an
illumination-independent contrast measure to verify the reliability of local threshold values.
His reliability index $r(P)$ for a given local region $P$ is defined as


$$r(P) = \frac{\sigma(P)}{\mu(P) - \beta}$$

where $\mu(P)$ and $\sigma(P)$ are the mean and standard deviation of intensity, respectively.

The parameter $\beta$ depends on the video camera and A/D converter used. To simulate
our space variant sensor for this experiment, we used a Sony XC-77RR CCD camera and an
Analogic DASM frame grabber. From an experiment similar to the one reported in [35], we
measured the value of $\beta$ to be very close to 0 for our equipment.
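A minimal sketch of the reliability index, assuming the gray values of a region are available as a list. The function name `reliability` is hypothetical, and the population standard deviation is used here as one reasonable reading of $\sigma(P)$.

```python
from statistics import mean, pstdev


def reliability(values, beta=0.0):
    """r(P) = sigma(P) / (mu(P) - beta) for the gray values of region P."""
    denom = mean(values) - beta
    if denom == 0:
        raise ValueError("region mean equals beta; index is undefined")
    return pstdev(values) / denom


flat = reliability([100, 100, 100, 100])    # uniform region, no contrast
bimodal = reliability([50, 150, 50, 150])   # region straddling an edge
```

Scaling every gray value in a region by the same illumination factor scales both the mean and the standard deviation, so with $\beta$ near 0 the index is unchanged.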

Larger values of $r(P)$ indicate evidence for reliable local thresholds. We use a relaxation
method to propagate threshold values from regions having reliable values to the regions having
unreliable ones. If there is no reliable threshold value, we resort to a global binarization
method, but this situation occurs rarely in real scenes. For those local regions which have
reliable threshold values, we use the region mean or the median as the threshold value, both of
which perform well. The algorithm described below computes threshold values for every pixel.
Figure 11: Example of binarization using a 2-level pyramid. The left image depicts a scene
containing a license plate under uneven illumination. The middle image shows the result of
applying a single global threshold. The right image shows the result of applying an adaptive
local binarization procedure.

It propagates reliable threshold values in level 1 first. Then, it propagates these values in level
0.

Mark those vertices $p$ at level 1 having reliable thresholds as seed points. Set the gray
values $L_0(p)$ equal to a threshold value such as the mean $\mu(P)$. Then, apply the soap bubble
algorithm given by equation (1) until convergence. We propagate these values to the seed
points at level 0, i.e. $L(q) = L(p)$ if $\mathit{parent}(q) = p$. We assign a constant gray value to
the remaining graph vertices at level 0, then again apply algorithm (1) to obtain individual
threshold values for every pixel in the graph. Using these thresholds, we binarize the image.
The image in Figure 11 is a license plate under uneven lighting conditions. We show the results
of applying global and local thresholding to it.
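The two-stage propagation can be sketched as follows. The sketch assumes that the soap bubble update replaces each non-seed value with the mean of its neighbors' values, that every non-seed vertex has at least one neighbor, and that the set of level-0 seed pixels `seeds0` is supplied by the caller; all function and parameter names are hypothetical.

```python
def soap_bubble(L0, N, seeds, tol=1e-6, max_iter=10000):
    """Relax values on graph N: seeds stay fixed at L0, every other vertex
    is repeatedly replaced by the mean of its neighbors (assumed update)."""
    L = dict(L0)
    for _ in range(max_iter):
        new = {
            p: L0[p] if p in seeds else sum(L[q] for q in N[p]) / len(N[p])
            for p in N
        }
        change = max(abs(new[p] - L[p]) for p in N)
        L = new
        if change < tol:
            break
    return L


def propagate_thresholds(N1, N0, parent, region_threshold, reliable, seeds0, fill):
    """Propagate thresholds at level 1, then hand them down to level 0."""
    init1 = {p: region_threshold[p] if p in reliable else fill for p in N1}
    L1 = soap_bubble(init1, N1, reliable)
    init0 = {q: L1[parent[q]] if q in seeds0 else fill for q in N0}
    return soap_bubble(init0, N0, seeds0)


# Hypothetical pyramid: three regions in a row, six pixels in a row.
N1 = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
N0 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
parent = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
thresholds = propagate_thresholds(
    N1, N0, parent,
    region_threshold={"A": 40.0, "C": 80.0},  # region B is unreliable
    reliable={"A", "C"}, seeds0={0, 2, 5}, fill=0.0,
)
```

A pixel `q` with gray value `L[q]` would then be set to 1 when `L[q] > thresholds[q]` and to 0 otherwise.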


5  Transformation  graphs 

Log polar images have mathematical properties that make them attractive for certain
applications such as optical flow and navigation [11] [12] [15] [21] [2d] [28] [27] [32] [33] [34] [30] [11]
[its] [45] [•!?] [50] [44]. For an image sensor having a pixel geometry given by $w = \log(z)$, image
scaling is equivalent to radial shifting, and image rotation is equivalent to annular shifting.
These facts alone have motivated much of the work to date on foveated sensors.

But the space variant image translation problem has until now remained cumbersome.
Also, for a logmap sensor defined by $w = \log(z + a)$, the shift properties of scaling
and rotation are only approximations of the actual effects. In response to these problems, we
developed a graph-based method for image transformations. The graph's vertices are pixels
and its edges represent the redistribution of gray levels effected by a transformation. For this
reason, the graph is called a transformation graph.
Figure 12: (a) A TV image region corresponding to a logmap pixel, (b) the effect of pixel
translation, (c) the four pixels whose gray value will be affected by this translation.


5.1  Translation  Graph 

The  transformation  graph  methods  for  translation,  rotation  and  scaling  are  all  similar.  All 
three  are  special  cases  of  more  general  image  mappings,  but  we  limit  the  discussion  to  these 
three  for  simplicity  of  exposition.  To  simplify  even  further,  let  us  first  focus  on  the  problem  of 
image  translation. 

One approach is to remap the image back to TV coordinates, using the mapping described
in section 1.2, then to translate the TV image, and then to map back to sensor coordinates. The
drawback to this method is that it takes time proportional to the size of the TV image. In
this section we show how to use a set of graphs to implement translation in time proportional
to the number of space variant image pixels, which can be as much as four orders of
magnitude smaller than the number of pixels in a TV image having the same field of view [27].

For every translation $T = (\Delta i, \Delta j)$ we construct a translation graph $G_T = (V, E_T)$ from the
connectivity graph $G = (V, E)$. In order to translate the space variant image by the offset $T$
we need to compute the redistribution of gray levels in the translated image. Figure 12 shows
that the translation of a pixel redistributes its gray value to possibly several other pixels.

Given a translation $T = (\Delta i, \Delta j)$ the edge set $E_T$ contains $(p, q)$ provided

$$\exists\,(i,j) \mid R(i,j) = u(q) \text{ and } S(i,j) = v(q) \text{ and } R(i - \Delta i, j - \Delta j) = u(p) \text{ and } S(i - \Delta i, j - \Delta j) = v(p).$$

Associated with every edge $(p, q) \in E_T$ is a number

$$\kappa_T(p,q) = \sum_{i,j} 1 \mid R(i,j) = u(q) \text{ and } S(i,j) = v(q) \text{ and } R(i - \Delta i, j - \Delta j) = u(p) \text{ and } S(i - \Delta i, j - \Delta j) = v(p)$$

indicating the number of TV pixels shifted by $T$ from $p$ to $q$.

Using the translation graph (TG), we compute the translated image by the following
expression:

$$L_T(q) = \frac{\sum \kappa_T(p,q)\,L(p)}{\sum \kappa_T(p,q)} \;\Bigm|\; (p,q) \in E_T.$$

Figure 13 shows a plot of the graph $G_T$ for the vector $T = (0, 8)$.
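Building and applying a translation graph can be sketched directly from the lookup tables. The sketch assumes $R$ and $S$ are stored as nested lists indexed by TV pixel, that every TV pixel maps to some logmap pixel, and that a logmap pixel is identified by its coordinate pair $(u, v)$; the function names are hypothetical.

```python
from collections import defaultdict


def build_translation_graph(R, S, di, dj):
    """Return kappa[(p, q)]: the number of TV pixels carried from logmap
    pixel p to logmap pixel q by the translation T = (di, dj)."""
    rows, cols = len(R), len(R[0])
    kappa = defaultdict(int)
    for i in range(rows):
        for j in range(cols):
            si, sj = i - di, j - dj  # TV pixel that lands on (i, j)
            if 0 <= si < rows and 0 <= sj < cols:
                p = (R[si][sj], S[si][sj])
                q = (R[i][j], S[i][j])
                kappa[(p, q)] += 1
    return dict(kappa)


def translate(L, kappa):
    """Translated image: kappa-weighted mean of source gray values per q."""
    num, den = defaultdict(float), defaultdict(int)
    for (p, q), k in kappa.items():
        num[q] += k * L[p]
        den[q] += k
    return {q: num[q] / den[q] for q in den}


# Hypothetical sensor: one TV row of six pixels mapped onto four logmap
# pixels, the outer two of which each cover two TV pixels.
R = [[0, 0, 1, 2, 3, 3]]
S = [[0, 0, 0, 0, 0, 0]]
kappa = build_translation_graph(R, S, di=0, dj=1)
L = {(0, 0): 10.0, (1, 0): 20.0, (2, 0): 30.0, (3, 0): 40.0}
shifted = translate(L, kappa)
```

Construction visits every TV pixel once, but it is done offline; `translate` touches only the edges of the graph, which is the source of the run-time saving.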




Figure  13:  Translation  graph  for  the  translation  vector  T  =  (0,8). 


Any given vertex $p$ in $G_T$ is associated with a set of edges $E_T(p) = \{(p, q) \in E_T\}$. Assuming
that $|E_T(p)|$ is bounded above by a constant, then computing a translation by (5) takes $O(k)$
steps. The drawback to TGs is that storing all of them requires a large amount of memory.
If we allow translations to every TV pixel $(i, j)$, then we need to store $mn$ graphs of size $O(k)$.
Allowing translation only to the $k$ centroids $\mu(p), p \in V$ requires us to store $k$ graphs, giving
a total storage requirement of $O(k^2)$ locations.

But in this case it is important to look at the constant factors. The combination of a
sensor with thousands of pixels with a computer having millions of memory locations is quite
feasible at very low cost. Moreover, the transformation graphs contain a lot of redundant
structure, indicating that a memory representation of them could be compressed. Also, many
tracking and matching applications require only a small set of cached possible translations.
These considerations make the transformation graph approach practical despite its quadratic
memory cost.


5.2  Rotation  and  Scaling  Graphs 

The solution to the rotation and scaling problem for the general space variant sensor geometry is
essentially the same as the solution for translation. After analyzing the gray level redistribution
caused by rotating the original image, we can construct a rotation graph, defined analogously to the
translation graph. To simplify the problem, let us assume that rotations must be integer
multiples of the angular sampling period of the logmap sensor. Call the angle between adjacent
spokes of the logmap the element angle. We consider only rotations that are an integer number of
element angles. If there are $s$ spokes, we construct only the $s$ rotation graphs for all possible
rotations. Figure 14 shows a rotation graph which rotates the image by one element angle.


Figure 14: The rotation graph shown in forward (left) and inverse (right) coordinates of a
rotation by one element angle.

By a similar analysis, we can construct the scale graph. Call the ratio of the radii of
two consecutive rings the element ratio. If the sensor has $r$ rings, we need to compute the $r$
scale graphs for scaling up the image and $r$ scale graphs for scaling down. In Figure 15 we show
a graph which scales the logmap image by one element ratio. In Figures 16 and 17, we show
some examples of rotated and scaled logmap images.

The rotation and scaling graphs also contain a lot of redundant structure, and their number
is only linear in the number of spokes or rings. For our logmap sensor in particular, much of
each rotation and scaling graph is redundant. What these representations do is make explicit
the exceptions to the mathematical equivalence of scaling and rotation to image shifting.




Figure 16: Rotation of logmap images. The left image pair is the original logmap image. The
center pair is rotated by one element angle. The right pair shows a larger rotation.

Figure 17: Scaling of logmap images. The left image pair is scaled down and the right image
pair is scaled up.


6  Template  matching 

The translation graph enables us to solve the template matching problem for logmap images.
Suppose we have a logmap image template given by the graph $G_M = (V_M, \emptyset)$. Call the logmap
pixel values $L_M(p)$, $p \in V_M$. In general, the vertices $V_M \subset V$. The edge set for the template
model is empty, because to define the template model we need to consider only the pixels, not
relations between them.
Figure 18: The concept of pixel coverage for translation. The left box depicts a masked region
to be translated. The right box shows the pixel coverage for a translation along the x axis by
8 TV pixels.

When we translate the template image by a vector $T = (\Delta i, \Delta j)$, it overlays another set
of logmap pixels. Some of the second set fall completely under the translated template, but it
covers others only partially. In order to define the matching error, we need to formalize this
concept of coverage, illustrated by Figure 18. We define the image graph $G_{MT} = (V_{MT}, \emptyset)$ in
the following way. A pixel $q \in V_{MT}$ if and only if its area

$$\alpha(q) = \sum_{p \in V_M \text{ and } (p,q) \in E_T} \kappa_T(p,q)$$

is nonzero.

We compute the translated template image using the translation graph:

$$L_{MT}(q) = \frac{1}{\alpha(q)} \sum_{p \in V_M \text{ and } (p,q) \in E_T} \kappa_T(p,q)\,L_M(p)$$

An area weighted matching value between a translated template image and a data image $L(p)$
is defined by

$$\sum_{q \in V_{MT}} \alpha(q)\,\bigl(L(q) - L_{MT}(q)\bigr)^2 \Big/ \sum_{q \in V_{MT}} \alpha(q)$$

The  best  matching  position  T  is  the  one  having  minimum  matching  value. 
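The matching procedure can be sketched on top of precomputed translation graphs. The sketch assumes each graph is a dictionary of $\kappa_T$ values keyed by $(p, q)$, that coverage $\alpha(q)$ is the sum of $\kappa_T(p,q)$ over template pixels $p$, and that a translation covering no pixel scores infinity; the function names are hypothetical.

```python
def translate_template(LM, kappa):
    """Return (alpha, LMT): coverage and translated template value for
    every pixel q reached from some template pixel p."""
    alpha, acc = {}, {}
    for (p, q), k in kappa.items():
        if p in LM:  # only template pixels contribute
            alpha[q] = alpha.get(q, 0) + k
            acc[q] = acc.get(q, 0.0) + k * LM[p]
    return alpha, {q: acc[q] / alpha[q] for q in alpha}


def match_value(L, LM, kappa):
    """Area weighted mean squared difference between data and template."""
    alpha, LMT = translate_template(LM, kappa)
    total = sum(alpha.values())
    if total == 0:
        return float("inf")  # assumption: no coverage means no match
    return sum(alpha[q] * (L[q] - LMT[q]) ** 2 for q in alpha) / total


def best_translation(L, LM, graphs):
    """Return the translation T whose graph gives the minimum value."""
    return min(graphs, key=lambda T: match_value(L, LM, graphs[T]))


# Hypothetical four-pixel sensor with two cached translation graphs.
graphs = {
    "stay": {(0, 0): 2, (1, 1): 2},
    "shift": {(0, 1): 2, (1, 2): 2},
}
LM = {0: 10.0, 1: 20.0}                  # template occupies pixels 0 and 1
L = {0: 0.0, 1: 10.0, 2: 20.0, 3: 0.0}   # data image holds it one pixel over
best = best_translation(L, LM, graphs)
```

Restricting `graphs` to a small cached set of translations keeps both the memory cost and the search cost low, which suits tracking applications.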

Figure  19  shows  a  template  image  displayed  in  TV  coordinates.  Figure  20  shows  the  data 
image  and  the  results  of  template  matching.  Note  that  we  bounded  the  matching  radius  in 
order  to  get  high  pixel  coverage. 
Figure 19: The template image.


7  Conclusion 

The logmap sensor presents a special image processing problem because of its irregular pixel
geometry. Pixels that are adjacent in the sensor may not be adjacent in an array representation
of the sensor image. For example, in the logmap image defined by $w = \log(z + a)$ the pixels
along the vertical meridian are not connected to all of their neighbors in the array. Conventional
image processing functions, defined over arrays, do not produce adequate results when applied
to such space variant images. By explicitly representing the neighbor relations in space variant
images, the connectivity graph data abstraction allows us to define image processing operations
for space variant sensor data.

We have implemented a prototype real-time miniaturized active vision system based on the
logmap sensor geometry. Even with the stringent memory limitations (less than 128K bytes)
of our prototype system, we implemented image operations using the CG. These operations
became the components of the more complex applications to license plate reading, motion
tracking, and actuator calibration using visual sensory feedback.

Local image operators such as edge detectors and relaxation operators can be defined
easily in the CG. These operations are independent of the sensor geometry. As a side effect,
the local operators are defined everywhere in the space variant image, even at the image
boundaries. Therefore, there are no special cases in CG local operator definitions for image
boundary conditions. In this paper we showed local operators for edge detection and relaxation.
The latter operator enabled us to define an image binarization operation for images having
nonuniform illumination.

One slightly cumbersome issue is the definition of local operators involving more than one
pixel and its immediate neighbors. We have shown that such operators can be defined in
the CG, however, and they have the same generality as the immediate local operations. A
smoothing operator was presented as an example of an enlarged local operation.

Building on the work of Montanvert et al. [22], we can also define pyramid structures
over the CG. In this paper we touched upon the use of such pyramids, and showed how to
implement a local binarization operator in a 2-level pyramid.

Figure 20: The left column shows the target image in the forward and inverse logmap
coordinates. At the right is the result of template matching. The brightest pixel in the result map
corresponds to the best matching position.

Transformation graphs are relatives of connectivity graphs that represent the effects of
transformations such as translation and rotation in space variant images. Logmap images
in particular are known to have elegant mathematical properties with respect to scaling and
rotation, but translation of the data from these sensors has until now been problematic. If
we have enough memory to represent all translation graphs for a particular sensor having $k$
pixels, then we can compute a translated image in $O(k)$ steps. Our primary application of the
translation graph is template matching. We show the results of template matching in a space
variant image. The template matching operation is the basis for a target tracking program
that requires processing only at the low data rate of the logmap sensor.

Image processing operations defined in the CG are independent of the sensor geometry. We
developed a library of image processing routines that work on images from a variety of sensors.
The CG image processing library will be useful as we continue to experiment with new sensor
geometries, and as we begin to use future VLSI implementations of logmap sensors.

The authors would like to thank Dr. Ken Goldberg of the University of Southern California for his valuable
comments on an early draft of this paper. We would also like to thank the anonymous referees for their
substantive remarks and constructive criticisms. We gratefully thank Jack Judson of Texas Instruments for his
technical assistance in the development of our logmap sensor.